This talk reviews some key developments that have taken place in
hadron-collider jet finding over the past couple of years, including: technical 
advances such as the complete formulation of an infrared safe seedless cone 
algorithm and fast computational approaches to sequential recombination 
jet finders like the kt algorithm, together with universal methods for sub- 
tracting pileup; progress in understanding the sensitivity of jet algorithms 
to the underlying event and hadronisation; and work that exploits our 
knowledge of QCD divergences to better define and predict heavy-flavour 
jet cross sections. 

1. Introduction 

Jet algorithms provide a way of projecting away the multiparticle dy- 
namics of an event so as to leave a simple quasi-partonic picture of the 
underlying hard scattering. This projection is however fundamentally am- 
biguous, reflecting the divergent and quantum mechanical nature of QCD. 
Consequently, jet physics is a rich subject. 

Key developments in the history of jet finding have often been spurred 
by advances in experimental sophistication, and in this vein, the upcoming 
startup of the LHC provides a motivation for reexamining the technology 
at our disposal. 

To appreciate what changes at LHC, consider the physics scales and 
processes at play: in addition to having the electroweak (~ 100 GeV) and 
hadronisation (0.5 GeV) scales familiar at LEP and HERA, and an un- 
derlying event (~ 10 GeV) 2-4 times larger than the Tevatron's, the LHC 
will routinely probe multi-jet events of unprecedented complexity (think 
$t\bar{t}H \to 8$ jets), it will suffer from huge pileup (~ 100 GeV of pt per unit
rapidity), and it may well discover new-particle cascades that mix the TeV 
scale and electroweak scales. That's vastly more to disentangle than ever 
before. To add to that, a key technical issue is posed by the number of
particles: up to ~ 4000 per event, two orders of magnitude larger than at 
LEP (~ 50), and an order larger than at Tevatron. 

A programme of work to bring jet-finding up-to-date for the LHC age 
has begun over the past couple of years and involves three main phases: 
1) develop a core set of theoretically solid and experimentally practical jet 
algorithms (i.e. ones satisfying the 1990 Snowmass accord [1]); 2) quantify, where
possible analytically, how jet algorithms respond to various non-perturbative
and perturbative QCD effects; 3) use the resulting understanding to guide 
development of more sophisticated tools. As described below, phase 1 is 
nearing completion, and progress is being made on the remaining parts. 

2. Core tools 

The 1990 Snowmass accord [1] for the Tevatron advocated the use of jet
algorithms that were simple to use, both theoretically and experimentally, 
well-defined and finite at all orders of perturbation theory, and relatively 
insensitive to hadronisation. However this accord has never fully been sat- 
isfied in Tevatron jet finding. 

Cone algorithms provide a top-down form of jet identification, and are
mostly based on the idea of a stable cone, one whose direction coincides 
with that of the summed momenta of the contained particles. They are 
widespread at pp colliders, motivated on the grounds that soft and collinear 
radiation leaves stable cones unchanged, and a feature often quoted as being 
one of their main experimental advantages is their simple conical shape. 
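The stable-cone condition can be made concrete with a short Python sketch. It is an illustration under simplifying assumptions rather than any experiment's implementation: particles are (pt, y, phi) tuples, the cone axis is approximated by the pt-weighted centroid of the contents instead of the direction of the full summed four-momentum, and all function names are hypothetical.

```python
import math

def delta_phi(a, b):
    """Azimuthal difference a - b wrapped into (-pi, pi]."""
    d = (a - b) % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d

def cone_contents(particles, y_c, phi_c, R):
    """Particles (pt, y, phi) lying within distance R of the cone centre."""
    return [p for p in particles
            if (p[1] - y_c) ** 2 + delta_phi(p[2], phi_c) ** 2 <= R * R]

def cone_axis(contents, phi_ref):
    """pt-weighted centroid of the contents (assumed stand-in for the
    direction of the summed momenta); phi is averaged relative to phi_ref."""
    total = sum(p[0] for p in contents)
    y = sum(p[0] * p[1] for p in contents) / total
    phi = phi_ref + sum(p[0] * delta_phi(p[2], phi_ref) for p in contents) / total
    return y, phi

def is_stable(particles, y_c, phi_c, R, tol=1e-9):
    """A cone is stable if its centre coincides with the axis of its contents."""
    contents = cone_contents(particles, y_c, phi_c, R)
    if not contents:
        return False
    y, phi = cone_axis(contents, phi_c)
    return abs(y - y_c) < tol and abs(delta_phi(phi, phi_c)) < tol

particles = [(100.0, 0.0, 0.0), (50.0, 0.6, 0.0)]
print(is_stable(particles, 0.0, 0.0, 0.7))  # cone centred on the harder particle
print(is_stable(particles, 0.2, 0.0, 0.7))  # cone centred on the pt-weighted centroid
```

With R = 0.7 the cone centred on the harder particle also captures the softer one, so its axis moves and it is not stable; the cone centred on their common centroid is.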

Cone algorithms have long been plagued by infrared and collinear (IRC)
safety issues. Old iterative cones with split-merge procedures are unreliable
from order $\alpha_s^3$ (or $\alpha_{EW}\alpha_s^2$) onwards, Tevatron run II "midpoint" type cones
from order $\alpha_s^4$ (or $\alpha_{EW}\alpha_s^3$). Unreliable at order $\alpha_s^n$ means here that they
diverge when calculated at order $\alpha_s^{n+1}$: regulating that divergence around
$\Lambda_{QCD}$ introduces a term $\sim \alpha_s^{n+1} \ln(p_t/\Lambda_{QCD})$. This is the same size as
the $\alpha_s^n$ term (recall $\alpha_s \sim 1/\ln(p_t/\Lambda_{QCD})$), so any effort¹ that went into
the $\alpha_s^n$ precision is swamped by the near-divergent uncalculated higher orders.
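The power counting behind this statement can be written out in one line. The following is a schematic estimate only, using the leading-order relation between the coupling and the logarithm and ignoring all numerical coefficients:

```latex
\[
  \alpha_s^{n+1}\,\ln\frac{p_t}{\Lambda_{QCD}}
  \;\sim\; \alpha_s^{n+1}\cdot\frac{1}{\alpha_s}
  \;=\; \alpha_s^{n},
  \qquad\text{since}\qquad
  \alpha_s \sim \frac{1}{\ln(p_t/\Lambda_{QCD})}.
\]
```

The regulated divergence at order $\alpha_s^{n+1}$ is therefore parametrically as large as the order $\alpha_s^n$ term itself.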

The practical relevance of the IRC safety issue has been repeatedly
questioned, it being noted for example that in the most widely studied of
jet observables, the inclusive-jet spectrum, the "real" effect of IR unsafety is
seen to be 1%. This figure however holds only for this observable: leading
order for the jet spectrum is $\alpha_s^2$, the midpoint cone's unreliability starts at
$\alpha_s^4$, and for $\alpha_s \sim 0.1$ the ratio is $\alpha_s^2 \sim 1\%$. At LHC, many interesting
processes start at that order or higher, and then even leading order can be
unreliable, with up to 50% effects, depending on the cuts.



1 Between 50 and 100 people working over ten years, i.e. ~ $50 million. 
Fig. 1. Left: IR safety failure rate for a range of jet algorithms in artificial events
with between 2 and 10 hard particles (for details, see [4]). Right: speeds of various
algorithms as a function of the particle multiplicity N. 

The design of an IRC safe cone algorithm starts with the observation
that you should find all stable cones [2]. Ref. [3] showed how to do this for a
handful of particles (in $N 2^N$ time, i.e. $10^{17}$ years for $N = 100$). Recently,
Ref. [4] reduced that to a more manageable $N^2 \ln N$. The trick was to recast it
as a computational geometry problem, i.e. the identification of all distinct
circular enclosures for points in 2D, and to find a (previously unknown)
solution to that. Together with a few other minor fixes, this has led to the
first ever IRC safe cone jet algorithm, SISCone (cf. left plot of fig. 1).
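To get a feel for the gap between the two scalings, the operation counts can be compared directly. This is a rough sketch: the counts ignore all constant factors, and the assumed rate of 10^9 elementary operations per second is a hypothetical round number, so the absolute times are only indicative (a larger cost per operation would push the brute-force estimate towards the figure quoted above).

```python
import math

SECONDS_PER_YEAR = 3.156e7

def brute_force_ops(n):
    """Operation count ~ N * 2^N for testing every subset of N particles."""
    return n * 2 ** n

def geometric_ops(n):
    """Operation count ~ N^2 ln N for the computational-geometry approach."""
    return n * n * math.log(n)

def years(ops, ops_per_second=1e9):
    """Convert an operation count to years at an assumed (hypothetical) rate."""
    return ops / ops_per_second / SECONDS_PER_YEAR

n = 100
print(f"N 2^N     : {brute_force_ops(n):.2e} ops, {years(brute_force_ops(n)):.1e} years")
print(f"N^2 ln N  : {geometric_ops(n):.2e} ops, {years(geometric_ops(n)) * SECONDS_PER_YEAR:.1e} seconds")
```

Whatever rate is assumed, the exponential count is out of reach for N = 100, while the $N^2 \ln N$ count is a few times $10^4$ operations.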

Sequential recombination algorithms (SRAs), such as kt [5], take a 
bottom-up approach to creating jets, successively merging the closest pair 
of objects in an event until all are sufficiently well separated. They work 
because of relations between the distance measures that are used and the 
divergences of QCD. Their attractions include their conceptual simplicity, 
as well as the hierarchical structure they ascribe to an event, and they were 
ubiquitous at LEP and HERA. 
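The bottom-up procedure can be sketched in a few lines of Python. The version below is a naive illustration, assuming the inclusive, longitudinally invariant form of the kt algorithm (pair distance min(kt_i^2, kt_j^2) ΔR_ij^2 / R^2, beam distance kt_i^2) with four-momentum addition as the recombination scheme. It is not the FastJet implementation; function names are hypothetical, and particles exactly along the beam are not handled.

```python
import math

def from_pt_y_phi(pt, y, phi):
    """Massless four-momentum (px, py, pz, E) from pt, rapidity and azimuth."""
    return (pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y), pt * math.cosh(y))

def pt2(p):
    return p[0] ** 2 + p[1] ** 2

def rapidity(p):
    return 0.5 * math.log((p[3] + p[2]) / (p[3] - p[2]))

def delta_r2(a, b):
    dphi = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (rapidity(a) - rapidity(b)) ** 2 + dphi ** 2

def kt_cluster(particles, R=1.0):
    """Naive inclusive kt clustering: O(N^2) work per step, O(N^3) overall."""
    objs = [tuple(p) for p in particles]
    jets = []
    while objs:
        best = (math.inf, None, None)          # (distance, i, j); j is None for the beam
        for i, a in enumerate(objs):
            if pt2(a) < best[0]:               # beam distance d_iB = kt_i^2
                best = (pt2(a), i, None)
            for j in range(i + 1, len(objs)):
                b = objs[j]
                d = min(pt2(a), pt2(b)) * delta_r2(a, b) / R ** 2
                if d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        if j is None:
            jets.append(objs.pop(i))           # closest to the beam: declare a final jet
        else:
            b = objs.pop(j)                    # j > i, so remove j first
            a = objs.pop(i)
            objs.append(tuple(x + y for x, y in zip(a, b)))   # merge the closest pair
    return sorted(jets, key=pt2, reverse=True)

event = [from_pt_y_phi(100.0, 0.0, 0.0),
         from_pt_y_phi(20.0, 0.1, 0.1),
         from_pt_y_phi(90.0, 0.0, math.pi)]
for jet in kt_cluster(event, R=0.7):
    print(round(math.sqrt(pt2(jet)), 1), round(rapidity(jet), 3))
```

In this toy event the soft particle is merged with its hard neighbour and the recoiling particle forms a jet of its own. The sequence of merges is what gives the event its hierarchical structure.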

There had been two major issues for SRAs in pp collisions: they used
to be slow (~ $N^3$ time to cluster $N$ particles, i.e. 1 minute for $N = 4000$)
and the shape of the resulting jets was unknown and irregular, which
complicated pileup subtraction. Recently the speed issue was solved [6] by
observing a connection with computational geometry problems: e.g. the kt
algorithm factorises into a priority queue and the problem of constructing
a nearest-neighbour graph in 2D and maintaining it under point changes
(solved in [7]). Asymptotically, run times are now $N \ln N$, and in practice
~ 20 ms for $N = 4000$. That's better even than a fast (but very IR unsafe)
iterative cone algorithm such as CDF's JetClu (cf. right plot of fig. 1).

The problem of the unknown shape of SRA jets has also been solved, by 
the simple expedient of adding very many infinitely soft "ghost" particles [8].
Fig. 2. Left: hadronic W and top mass peaks with pileup subtraction in
high-luminosity semi-leptonic $t\bar{t}$ events at LHC. Right: inclusive jet spectrum in LHC
PbPb events before and after background subtraction. Adapted from [9];
simulations used Pythia [11] and Hydjet [12].

These serve to fill in all empty space in the event and so give a well-defined 
boundary and total area to each jet. Subtracting a correction proportional 
to that area works rather well for removing pileup [9] and can even be 
applied to the extremely noisy environment of LHC PbPb collisions (fig. 2).
This progress, together with the recent successful measurement of the
inclusive jet spectrum by CDF with the kt algorithm [10], means that all
objections raised in the past about SRAs are now essentially resolved. 
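The area-based correction amounts to subtracting the product of a pileup density and the jet area from each jet's pt. The Python sketch below illustrates this under stated assumptions: jets are given as (pt, area) pairs with the areas already determined (for example with ghosts), and the density rho is estimated as the median of pt/area over the jets in the event, which is one common choice and is assumed here rather than taken from the text. The function names are hypothetical.

```python
from statistics import median

def estimate_rho(jets):
    """Pileup pt density per unit area, taken as the median of pt/area
    over all jets with non-zero area (an assumed estimator)."""
    return median(pt / area for pt, area in jets if area > 0.0)

def subtract_pileup(jets, rho):
    """Corrected pt for each jet: pt minus a term proportional to its area."""
    return [pt - rho * area for pt, area in jets]

# One hard jet on top of four pileup-like jets, as (pt in GeV, area).
jets = [(120.0, 0.5), (10.0, 0.5), (12.0, 0.6), (9.0, 0.45), (11.0, 0.55)]
rho = estimate_rho(jets)
print(rho)
print(subtract_pileup(jets, rho))
```

Using the median makes the density estimate insensitive to the few hard jets in the event, which would bias a simple mean.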



3. Understanding and improving jet algorithms 



Once you have a set of safe, fast algorithms (all conveniently packaged in
FastJet [6]), you can start trying to understand their physics behaviour. A
simple question, for example, is that of how hadronisation and the underlying
event (UE) modify a jet's transverse momentum. The situation is summarised
in table 1 [13], whose results are essentially common to all jet algorithms with
a radius parameter R. The distinct R-dependence of each effect may provide a
way of disentangling them experimentally. It also implies an optimal R
(minimising the sum of squares of effects) that varies significantly with the
jet initiator's colour and pt, as illustrated in fig. 3:
Fig. 3. Simple estimate for optimal
R as a function of jet pt, collider 
and initiating parton. 
Table 1. Summary of the main physical effects that contribute to the average
difference $\langle \delta p_t \rangle$ between the transverse momentum of a jet and its parent parton
(for small R). $\Lambda_h \sim 0.35 - 0.4$ GeV based on $e^+e^-$ event-shape studies [13].
$\Lambda_{UE} \simeq 4\,\mathrm{GeV} \cdot (s/(2\,\mathrm{TeV})^2)^{0.25}$



based on the results in table 1, at large pt perturbative radiation dominates
over other contributions, so one prefers R ~ 1, whereas at low pt the UE 
has a significant relative impact and it is advantageous to lower R to limit 
this, especially at higher energy colliders (since the UE grows with s) and 
for quark jets (for which perturbative radiation is weaker). Note that these 
results for the optimal R are mainly to be taken as indicative of general 
trends: a definitive estimate would go beyond the small-R approximation
and take into account the dispersion of each effect rather than its mean 
value. 
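The trends described here can be reproduced with a toy numerical minimisation. The sketch below is indicative only and rests on assumptions that are not spelled out in this text: the small-R scalings are taken to be of the form (alpha_s C / pi) pt ln R for perturbative radiation, -C Lambda_h / R for hadronisation and Lambda_UE R^2 / 2 for the underlying event, with all O(1) coefficients set to one, a fixed coupling, and the scales quoted in the caption of table 1. The function names are hypothetical.

```python
import math

C_F, C_A = 4.0 / 3.0, 3.0   # colour factors for quark- and gluon-initiated jets
ALPHA_S = 0.12              # fixed coupling (simplifying assumption)
LAMBDA_H = 0.4              # GeV, hadronisation scale from the table 1 caption

def lambda_ue(sqrt_s_tev):
    """UE scale in GeV, 4 GeV * (s / (2 TeV)^2)^0.25 as in the table 1 caption."""
    return 4.0 * (sqrt_s_tev ** 2 / 2.0 ** 2) ** 0.25

def sum_of_squares(R, pt, colour, sqrt_s_tev):
    """Sum of squared average pt shifts, using the assumed small-R forms."""
    pert = ALPHA_S * colour / math.pi * pt * math.log(R)   # assumed ~ ln R
    hadr = -colour * LAMBDA_H / R                           # assumed ~ -1/R
    ue = lambda_ue(sqrt_s_tev) * R ** 2 / 2.0               # assumed ~ R^2
    return pert ** 2 + hadr ** 2 + ue ** 2

def optimal_R(pt, colour, sqrt_s_tev):
    """Grid search for the R in [0.05, 1.50] minimising the sum of squares."""
    grid = [0.05 + 0.01 * i for i in range(146)]
    return min(grid, key=lambda R: sum_of_squares(R, pt, colour, sqrt_s_tev))

for label, colour in (("quark", C_F), ("gluon", C_A)):
    for sqrt_s in (2.0, 14.0):
        for pt in (50.0, 500.0):
            print(label, sqrt_s, pt, round(optimal_R(pt, colour, sqrt_s), 2))
```

Within these assumptions the preferred R grows with pt, is smaller at the higher-energy collider, and is smaller for quark jets than for gluon jets, in line with the qualitative discussion above.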

While to a first approximation the effects shown in table 1 are
independent of the specific jet algorithm, more refined studies, e.g. of jet areas [8],
do highlight differences between algorithms, but not always as one would
expect. For example, with heavy pileup, the kt algorithm, often labelled a
vacuum cleaner, actually has an average area quite close to $\pi R^2$ (essentially
because pileup is not vacuum); cone algorithms are widely assumed to have
an area $\pi R^2$, but modern versions with split-merge steps (e.g. SISCone)
actually turn out not to be quite conical, having an area $\sim \pi R^2/2$. This
small area is part of the reason why they work well in noisy environments.²
It also has important implications for strategies that assume an area of $\pi R^2$
in correcting for pileup with cone-type algorithms.

Perhaps the most striking example to date where a better understand- 
ing of clustering dynamics can lead to improved algorithms concerns jet 
flavour. This concept is often taken for granted (over 350 articles' titles 
contain the words 'quark jet' or 'gluon jet'), and it would seem that if one 
simply sums the flavours of all partons in a jet one might obtain a well- 
defined result for the jet-flavour. This turns out not to be the case, even 
in algorithms for which the jet momenta are IRC safe, because the flavour 
is subject to contamination by large-angle $g \to q\bar{q}$ splitting of a soft gluon,

2 On the other hand, for hard particles, modern cone algorithms like SISCone have 
a reach that extends somewhat beyond R and this can lead to issues in resolving 
complex multi-jet events. 
where the $q$ and $\bar{q}$ then enter separate jets. A simple modification [16] of
the distance measure in the kt algorithm can solve the problem and make 
the flavour IRC safe. 

A key advantage of the resulting IRC safe "flavour-kt" algorithm emerges
when talking about heavy-flavour jet cross sections. With unsafe definitions,
higher orders involve powers of large logarithms $\ln(p_t/m_b)$, giving large
NLO scale uncertainties. This is especially the case for current experimental
b-jet measurements, in which a jet containing a $b$ and a $\bar{b}$ is considered
to be a normal b-jet. With a proper, IRC safe definition most of the large
logarithms disappear, and the remaining ones can be absorbed into the parton
distribution functions. The result is a reduction in the theory uncertainty
for the inclusive b-jet spectrum from
~ 40 - 50% (see e.g. [17]) to the ~ 10 - 20% shown in fig. 4 [15] (calculated
with a modified version of nlojet++ P2]). It should be said that while 
normal IRC safe jet algorithms involve no particular experimental issues,
the flavour-kt algorithm does require a jet's flavour to be taken as the sum
of the flavours of the jet's constituents, i.e. one should be able to
distinguish a jet containing a $b$ and a $\bar{b}$ (not a b-jet) from one that contains just a
single $b$. This is challenging experimentally, but if it can be done it will also
have significant benefits in reducing QCD "b"-jet backgrounds (more than
50% of which come from $b\bar{b}$ jets) in new-physics searches. A measurement
of the flavour-kt b-jet spectrum would then provide a powerful cross-check
that the separation of $b\bar{b}$ and $b$ jets is being done effectively.
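The flavour bookkeeping implied by this definition is simple to state in code. The sketch below assumes that constituents are labelled by PDG Monte Carlo particle codes (5 for a b quark, -5 for its antiquark) and that a jet counts as a b-jet when its net b flavour is non-zero; the function names are hypothetical and the experimental question of how to identify the constituents' flavours is left aside.

```python
def net_flavour(constituent_ids, flavour=5):
    """Net flavour of a jet: +1 for each quark of the given flavour,
    -1 for each antiquark, 0 for everything else."""
    return sum(1 if pid == flavour else -1 if pid == -flavour else 0
               for pid in constituent_ids)

def is_b_jet(constituent_ids):
    """A jet is labelled a b-jet only if its net b flavour is non-zero."""
    return net_flavour(constituent_ids) != 0

print(is_b_jet([21, 5, 21]))    # single b plus gluons
print(is_b_jet([5, -5, 21]))    # b and anti-b from a gluon splitting
```

A jet containing both a $b$ and a $\bar{b}$ has zero net flavour and is therefore not counted as a b-jet, which is the distinction that removes the large logarithms.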

4. Outlook 

Practical, infrared and collinear safe options now exist for both cone 
and sequential recombination jet algorithms, and are in the process of
being incorporated into the LHC experiments' software frameworks. If they
are widely adopted (not to be taken for granted given the continued pres- 
ence also of long-established unsafe options and the inertia inherent in large 
organisations), hadron-collider jet-finding will finally come into accord with 
the 1990 Snowmass principles, a key step if the LHC is to benefit from the 
ongoing and extensive calculational effort in QCD. 

There is more, of course, to jet finding than just practicality and IRC 
safety. Most of the thought about jet algorithms, historically, has been
for low-noise, single-scale, quark-jet dominated environments such as LEP. 
Work is now being carried out that addresses the significant novel issues 
that arise at the LHC. There is considerable scope for further work, and it 
is to be hoped that this, together with open exchanges between theorists 
and experimenters, will help us make the best possible use of jets in the rich 
environment of LHC. 